Abstract

Utility based methods provide a very general, theoretically consistent
approach to pricing and hedging of securities in incomplete financial
markets. Solving problems in the utility based framework typically
involves dynamic programming, which in practice can be difficult to
implement. This article presents a Monte Carlo approach to optimal
portfolio problems for which the dynamic programming is based on
the exponential utility function $U(x) = -\exp(-x)$. The algorithm,
inspired by the Longstaff-Schwartz approach to pricing American options
by Monte Carlo simulation, involves learning the optimal portfolio
selection strategy on simulated Monte Carlo data. It shares with
the LS framework its intuitiveness, simplicity and flexibility.
1 Introduction


As realized in the pioneering work of Black, Scholes, Merton and others,
financial assets in complete markets can be priced uniquely by construction of
replicating portfolios and application of the no arbitrage principle. This
conceptual framework forms the basis of much of the currently used methodology
for financial engineering. In recent years, however, finance practitioners have
been increasingly led by competitive pressures to the use of much more
general incomplete market models, such as those driven by noise with stochastic
volatility, jumps or general Lévy processes. In incomplete markets, matters
are much more complicated, and the pricing and hedging of financial assets
depends on the risk preferences of the investor.

Utility based portfolio theory provides a coherent, general and economically
sound approach to risk-management in general financial models. This
theory is built on the principle that market agents invest rationally by
seeking to maximize their expected utility over some time period, where their
utility function encodes the “happiness” they derive in holding a given level
of wealth. Key works in this program are those of |H|, |T!J, |2Uj]. The
culmination of these results is a body of theory which gives necessary and sufficient
conditions for existence and uniqueness of optimal portfolios in a broad range
of contexts.

Utility based pricing and hedging are extensions growing naturally out
of portfolio optimization, and much work is now in progress to place these
methods in the broadest context, and to explore their various ramifications.
The basic problem is that of a rational agent who seeks to find their optimal
hedging portfolio when they have sold (or bought) a contingent claim.
This framework leads to new concepts, notably the Davis price |J and the
indifference price of the contingent claim [16].


This much more general theory is naturally applicable in areas such as
insurance where the complete market theory appears inappropriate pH]. In
this context, the indifference price can be thought of as the reservation price
of the claim, that is, the amount the insurer should set aside to deal with its
future liability.

Practical implementation of incomplete market models based on these new
theoretical developments requires the development of efficient numerical
methods. Three distinct approaches can be considered and ultimately all three
are needed for a complete understanding of implementation issues. One
approach is the numerical solution of general Hamilton-Jacobi-Bellman
equations, which are the partial differential equations derived from stochastic
control theory. A second approach could be broadly classified as “state space
discretization”, by which we mean tree and lattice based methods. A third
broad approach can be called Monte Carlo or random simulation based
methods. It is this third approach we attempt to realize in the present paper.

To our knowledge, Monte Carlo methods, although widely used for pricing
derivatives 0, have not been extensively used for optimal portfolio theory.
Some works related to this in the context of complete markets are |H| and [7].
Our proposed application of Monte Carlo is intrinsically more difficult than,
for example, its use in the pricing of American style options, a problem which
has only quite recently been efficiently implemented with the least squares
algorithm of [21]. Despite these difficulties, which we will see quite clearly
in this paper, Monte Carlo methods have a great asset in being very simple
and intuitive. By implementing such methods, we can gain key intuition and
understanding which may be quite difficult to learn from the abstract theory.

The paper is organized as follows. Section 2 provides the reader with a
rather detailed survey of the current theory of optimal portfolios. We give
careful statements of the main results concerning the existence and
uniqueness of optimal solutions for Merton’s problem. We also review the
framework of utility based hedging, introducing the key concepts and the basic
existence/uniqueness results. The special case of exponential utility is
discussed in some detail, because it has the important property that optimal
solutions are independent of the level of wealth. This property has an
important implication for our proposed Monte Carlo algorithm.

Section 3 focuses on the dynamics of portfolio optimization, in particular,
the principle of dynamic programming. The concepts of certainty equivalent
value, indifference price and the Davis price are introduced. The example of
the geometric Brownian motion market is worked out in some detail.
Section 4 specializes to the discrete time hedging framework and gives explicit
formulas for dynamic programming.

The main innovation of the paper is the exponential utility algorithm
given in section 5. It is a Monte Carlo method for learning the optimal
trading strategy for the class of discrete time hedging problems introduced in
section 4. This algorithm is inspired by the least-squares algorithm of Longstaff
and Schwartz for pricing American options. Interestingly, our method works
well only for the exponential utility, and no simple extension suggests
itself for general utility functions. Section 6 describes our first application of
the algorithm to hedging in a one-dimensional geometric Brownian motion
model. We focus on this exactly solvable model in order to have explicit
formulas with which to compare our Monte Carlo simulation. While the
hedging strategies learned by the algorithm are somewhat crude, we find
that the computed indifference prices are quite accurate. In our concluding
section 7, we discuss the various advantages and drawbacks we see in the
method.


2 Utility based hedging for semimartingale markets


The hedging problem is the problem of a market agent who faces a liability
$B$ at a time $T$ and must invest in the market over the period $[0, T]$ in an
efficient, rational or otherwise optimal way to reduce the risk of the liability.
The randomness of the market is represented by a filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,T]}, P)$ satisfying the “usual conditions” of right
continuity and completeness, and we assume for simplicity that $\mathcal{F} = \mathcal{F}_T$.
The discounted prices of tradeable assets in the market are given by the
$\mathbb{R}^d$-valued càdlàg semimartingale $S_t = (S_t^1, \ldots, S_t^d)$ on the filtration
$(\mathcal{F}_t)$. The liability $B$ is assumed to be an $\mathcal{F}_T$-measurable random variable.

A portfolio process, or a trading strategy, is an $\mathbb{R}^d$-valued predictable
$S$-integrable process $H_t = (H_t^1, \ldots, H_t^d)$, which represents the agent’s asset
allocations, that is, how many units of each traded asset are held by the
agent at each time $t$. The class of such processes is denoted by $L(S)$ ^3[. We
assume that the portfolio is self-financing (i.e. the changes in its discounted
market value are solely due to the random changes in the prices of the traded
assets) so that the agent’s discounted wealth at each time $t$ is given by the
process

$$X_t = x + (H \cdot S)_t := x + \int_0^t H_u \, dS_u, \qquad t \in [0, T],$$

where $x \in \mathbb{R}$ is some deterministic initial wealth.
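In discrete time the stochastic integral reduces to a sum of holdings times price increments. The following is a minimal sketch of that special case, assuming a single asset and a finite trading grid; the function name `discounted_wealth` and the sample numbers are illustrative and do not come from the paper.

```python
import numpy as np

def discounted_wealth(x, holdings, prices):
    """Discrete-time analogue of X_t = x + (H . S)_t for one asset.

    holdings[k] is the position held over (t_k, t_{k+1}]; to be predictable
    it must be chosen using information available at t_k only.
    prices has one more entry than holdings.
    """
    holdings = np.asarray(holdings, dtype=float)
    prices = np.asarray(prices, dtype=float)
    gains = holdings * np.diff(prices)   # H_{t_k} * (S_{t_{k+1}} - S_{t_k})
    return x + np.concatenate(([0.0], np.cumsum(gains)))

# Hypothetical path: initial wealth 10, three trading periods.
wealth = discounted_wealth(10.0, [1.0, 2.0, 0.0], [100.0, 101.0, 99.0, 104.0])
```

Because the prices are already discounted, no interest term appears: wealth changes only through the gains from trade, which is exactly the self-financing condition.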

To rule out strategies for which the wealth assumes arbitrarily negative
values (such as “doubling strategies”), we need to assume some admissibility
condition on the possible portfolio processes. Following [15], we adopt the
following definition.


Definition 2.1 The class $\mathcal{H}$ of admissible portfolios consists of the processes
$H \in L(S)$ for which $(H \cdot S)_t$ is $P$-a.s. uniformly bounded from below.

More explicitly, $H$ is admissible if there exists a constant $k > 0$ (possibly
depending on $H$, but neither on $t$ nor on $\omega$) such that

$$(H \cdot S)_t(\omega) \geq -k,$$

for almost all $\omega \in \Omega$ and all $t \in [0, T]$.

As a first consequence of this notion of admissibility, we have the following 
useful result concerning the closedness of the class of local martingales under 
stochastic integration Q theorem 2.9]: 

Lemma 2.2 If $S$ is a local martingale and $H$ is an admissible integrand for
$S$, then $(H \cdot S)$ is a local martingale. Consequently, $(H \cdot S)$ is a
supermartingale.

Regarding martingale measures, we adopt the following definition. 


Definition 2.3 A probability measure $Q$ is called an absolutely continuous
(resp. equivalent) local martingale measure for $S$ if $Q \ll P$ (resp. $Q \sim P$)
and $S$ is a local martingale under $Q$.


We denote the set of absolutely continuous (resp. equivalent) local
martingale measures for the price process $S$ by $\mathcal{M}^a(S)$ (resp. by $\mathcal{M}^e(S)$).
Observe that, due to lemma 2.2, a probability measure $Q \ll P$ (resp. $Q \sim P$) is
an absolutely continuous (resp. equivalent) local martingale measure if and
only if $(H \cdot S)$ is a local martingale under $Q$ for any $H \in \mathcal{H}$.

To ensure a viable market, free of arbitrage, we assume the technical
condition “No Free Lunch with Vanishing Risk” (NFLVR), which is a slightly
stronger requirement than “No Arbitrage” (NA). The reader is referred to [11, sections 2
and 3] for the precise definition of these notions, as well as the relations
between them. In its most general form [12], the fundamental theorem of asset
pricing (FTAP) asserts the equivalence between (NFLVR) and the existence
of an equivalent $\sigma$-martingale measure for the price process $S$ 0, which
might fail to be in $\mathcal{M}^e(S)$ if we allow $S$ to have unbounded unpredictable
jumps. The technicality of using $\sigma$-martingales can be avoided, however, if
we restrict ourselves to price processes $S$ which are locally bounded. By that
we mean that there exists a localizing sequence of stopping times $\{T_n\}$ such
that, for each $n$, the stopped process $S^{T_n}$ is bounded. In this context, we
have [13, corollary 1.2]:

Theorem 2.4 (FTAP) If $S$ is a locally bounded semimartingale, then there
exists an equivalent local martingale measure $Q$ for $S$ if and only if $S$ satisfies
(NFLVR).

In view of this theorem, we will henceforth assume that $S$ is locally
bounded and that


Assumption 1 (NFLVR) $\mathcal{M}^e(S) \neq \emptyset$.

The hedging problem can be made specific by introducing the agent’s
utility $U : \mathbb{R} \to \mathbb{R} \cup \{-\infty\}$, a concave, strictly increasing, differentiable
function. Beginning with initial capital $x \in \mathbb{R}$, the agent then solves the
optimal hedging problem

$$\sup_{H \in \mathcal{H}} E[U(x + (H \cdot S)_T - B)]. \tag{1}$$

If $B = 0$, the optimal hedging problem reduces to Merton’s optimal
investment problem

$$\sup_{H \in \mathcal{H}} E[U(x + (H \cdot S)_T)]. \tag{2}$$

To assert the existence and uniqueness of solutions to problems of this
form in incomplete markets, one first needs to impose further technical
restrictions on the class of utility functions. In the next assumption we
summarize the main properties required to hold throughout this paper. They
include the “reasonable asymptotic elasticity” condition as defined in [26,
definition 1.5].


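Assumption 2 The utility function $U : \mathbb{R} \to \mathbb{R} \cup \{-\infty\}$ is increasing on $\mathbb{R}$,
continuous on $\{U > -\infty\}$, differentiable and strictly concave on
$\mathrm{dom}(U) = \mathrm{int}\{U > -\infty\}$, satisfying

$$\lim_{x \to \infty} U'(x) = 0. \tag{3}$$

Furthermore, we assume that one of the following cases holds.

Case 1: $\mathrm{dom}(U) = (0, \infty)$, with

$$\lim_{x \to 0} U'(x) = \infty \quad \text{and} \quad \limsup_{x \to \infty} \frac{x U'(x)}{U(x)} < 1.$$

Case 2: $\mathrm{dom}(U) = \mathbb{R}$, with $\lim_{x \to -\infty} U'(x) = \infty$,

$$\limsup_{x \to \infty} \frac{x U'(x)}{U(x)} < 1 \quad \text{and} \quad \liminf_{x \to -\infty} \frac{x U'(x)}{U(x)} > 1.$$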
The central technical weaponry used to address the general solution to
problem (2) is convex duality, by means of which the utility maximization
problem over admissible portfolios (the “primal problem”) is related to a
minimization problem over a suitable domain in the set of measures on
$(\Omega, \mathcal{F}_T)$ (the “dual problem”). The first step is to define the conjugate
function $V$ as the Legendre transform of the function $-U(-x)$, that is

$$V(y) := \sup_{x \in \mathbb{R}} [U(x) - xy], \qquad y > 0. \tag{4}$$


It follows from well known results in convex analysis [24] that the function
$V$ has the properties listed below.


Proposition 2.5 If $U$ satisfies assumption 2, then the conjugate function $V$
is finite valued, differentiable, strictly convex on $(0, \infty)$ and satisfies

$$\lim_{y \to 0} V(y) = \lim_{x \to \infty} U(x), \qquad \lim_{y \to 0} V'(y) = -\infty. \tag{5}$$

Moreover, the behaviour of $V$ at infinity is determined by the two cases in
assumption 2 as follows:

Case 1: $\lim_{y \to \infty} V(y) = \lim_{x \to 0} U(x)$ and $\lim_{y \to \infty} V'(y) = 0$.

Case 2: $\lim_{y \to \infty} V(y) = \infty$ and $\lim_{y \to \infty} V'(y) = \infty$.
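As an illustration, which is not taken from the paper, the logarithmic utility is a standard example of case 1, and its conjugate can be computed by hand. The supremum in the definition of $V$ is attained at $x = 1/y$, and the resulting limits agree with those listed in the proposition:

```latex
\begin{aligned}
U(x) &= \log x, \quad x > 0, \\
V(y) &= \sup_{x > 0}\,[\log x - xy] = -\log y - 1, \qquad V'(y) = -\frac{1}{y}, \\
\lim_{y \to 0} V(y) &= \infty = \lim_{x \to \infty} U(x), \qquad \lim_{y \to 0} V'(y) = -\infty, \\
\lim_{y \to \infty} V(y) &= -\infty = \lim_{x \to 0} U(x), \qquad \lim_{y \to \infty} V'(y) = 0.
\end{aligned}
```

Here $x U'(x)/U(x) = 1/\log x \to 0$ as $x \to \infty$, so the asymptotic elasticity condition of case 1 holds.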


Both the primal and dual problems are solved over different domains
depending on which of the two cases above we are dealing with. We start
with the first case, for which the present state-of-the-art solution can be
found in [20]. Since in this case the utility function is only defined for positive
wealths, we need to consider the set

$$\mathcal{X}(x) = \{X \geq 0 : X_t = x + (H \cdot S)_t, \text{ for some } H \in L(S),\ 0 \leq t \leq T\}. \tag{6}$$

It is clear that $x + (H \cdot S)_t \geq 0$ implies that the portfolio $H$ must be admissible,
that is

$$\mathcal{X}(x) \subset \{X_t = x + (H \cdot S)_t,\ H \in \mathcal{H},\ t \in [0, T]\}$$

with a strict inclusion. Next we move from the set of processes $\mathcal{X}(x)$ to the
set of positive random variables

$$\mathcal{C}(x) = \{g \in L^0_+(\Omega, \mathcal{F}_T, P) : g \leq X_T, \text{ for some } X \in \mathcal{X}(x)\} \tag{7}$$
and observe that, since the utility function is increasing, the primal problem
for case 1 written in the form

$$\sup_{X \in \mathcal{X}(x)} E[U(X_T)] \tag{8}$$

is equivalent to

$$u(x) = \sup_{g \in \mathcal{C}(x)} E[U(g)]. \tag{9}$$


At this point, in order to exclude trivial cases, we make the following 
assumption. 

Assumption 3 The value function $u$ defined in (9) satisfies $u(x) < \infty$, for
some $x > 0$.

As for the domain of the dual problem, one looks for a set with the
property of being in a “polar relation” with the set $\mathcal{C}$ (the reader is referred
to f| for the definition of the polar of a subset of $L^0(\Omega, \mathcal{F}_T, P)$). In the
mathematical finance literature 0 0- the sets $\mathcal{M}^e(S)$ and $\mathcal{M}^a(S)$ were
considered. One of the main technical novelties in [20] was to enlarge this
domain in order to obtain a set $D$ in “perfect” polar relation with $\mathcal{C}$ (see [20,
proposition 3.1]). The set $D$ turns out to be the convex, solid hull of $\mathcal{M}^a(S)$ in
$L^0_+(\Omega, \mathcal{F}_T, P)$ (topologized by convergence in measure). Amongst the several
equivalent characterizations of the set $D$, we single out the following [25]:

$$D = \left\{ Y_T \in L^0_+(\Omega, \mathcal{F}_T, P) : \text{there exists a sequence } (Q_n)_{n=1}^{\infty} \in \mathcal{M}^a(S) \text{ such that } Y_T \leq \lim_{n \to \infty} \frac{dQ_n}{dP} \text{ a.s.} \right\}. \tag{10}$$


For $y > 0$, let us define $D(y) = yD$. The dual problem for case 1 can now
be formulated as

$$v(y) = \inf_{Y_T \in D(y)} E[V(Y_T)]. \tag{11}$$

The next theorem states the existence and uniqueness of solutions for the
problems (9) and (11) for utilities restricted to positive wealth [20, theorem
2.2].


Theorem 2.6 Suppose that assumptions 1, 2 (case 1) and 3 are satisfied.
Then, for any $x \in \mathrm{dom}(U)$ and $y > 0$, the problems

$$u(x) = \sup_{X_T \in \mathcal{C}(x)} E[U(X_T)], \qquad v(y) = \inf_{Y_T \in D(y)} E[V(Y_T)] \tag{12}$$

have unique optimizers $\hat{X}_T(x) \in \mathcal{C}(x)$ and $\hat{Y}_T(y) \in D(y)$ satisfying

$$U'(\hat{X}_T(x)) = \hat{Y}_T(y), \tag{13}$$

where $x$ and $y$ are related by $u'(x) = y$.

We note that this theorem and its proof apply unchanged if we modify
case 1 of assumption 2 to allow for utilities defined on an interval of the form
$(a, \infty)$, for any $a \in \mathbb{R}$, provided we impose that $\lim_{x \to a} U'(x) = \infty$. Observe also
that the optimizer $\hat{X}_T(x)$ can be uniquely expressed as

$$\hat{X}_T(x) = x + (\hat{H}(x) \cdot S)_T,$$

for $\hat{H}(x) \in \mathcal{H}$, whereas the optimizer $\hat{Y}_T(y)$, even for the cases where $\hat{Y}_T(y)/y$
is not the density of an absolutely continuous martingale measure (by having
its total $P$-mass strictly less than 1), can be arbitrarily approximated by
elements in $\mathcal{M}^a(S)$ (in the sense of almost sure convergence).

For utility functions defined on the entire real line the problem is more
involved, due to the fact that the class of admissible portfolios as in definition
2.1 turns out to be too narrow to contain the optimal solution. One approach
is to start with the dual problem, for which || shows that an optimal
solution always exists (under very general conditions). Then the set of allowed
portfolios can be characterized in terms of it. This opens up a plethora of
definitions of “allowed” portfolios. The reader interested in this line of thought
is referred to [10], [17], where the exponential utility is addressed, and to [27]
for more general utility functions.

A more direct idea is to concentrate on random variables which do not
necessarily arise as terminal values of wealth processes for any portfolios,
but which can be arbitrarily approximated by such objects. Different such
domains of optimization over random variables have been proposed [26], [14],
the difference being the kind of topology (convergence) adopted to describe
the approximation mentioned above. In what follows, we adopt the approach
proposed in [26], and specialize later on to the case of exponential utility
$U(x) = -\frac{e^{-\gamma x}}{\gamma}$, where sharper results can be quoted.

We denote by $\mathcal{C}_U^b(x)$ the class of random variables which have integrable
utility and can be dominated by the terminal wealth of admissible portfolios,
that is,

$$\mathcal{C}_U^b(x) = \{g \in L^0(\Omega, \mathcal{F}_T, P) : g \leq x + (H \cdot S)_T \text{ for some } H \in \mathcal{H} \text{ and } U(g) \in L^1(\Omega, \mathcal{F}_T, P)\}. \tag{14}$$
Next consider the closure of the set $\{U(g) : g \in \mathcal{C}_U^b(x)\}$ in the topology
of $L^1(\Omega, \mathcal{F}_T, P)$. Putting $U(\infty) := \lim_{x \to \infty} U(x)$, we see that the utility
function is a bijection between $\mathbb{R}$ and $\mathbb{R}$, if $U(\infty) = \infty$, and a bijection
between $\mathbb{R} \cup \{\infty\}$ and $(-\infty, U(\infty)]$ otherwise. We can therefore write a
general element in this closure as $U(f)$ for some $f \in L^0(\Omega, \mathcal{F}_T, P; \mathbb{R} \cup \{\infty\})$.
The set of such random variables is denoted by $\mathcal{C}_U(x)$, that is,

$$\mathcal{C}_U(x) = \{f \in L^0(\Omega, \mathcal{F}_T, P; \mathbb{R} \cup \{\infty\}) : U(f) \text{ is in the } L^1(P)\text{-closure of } \{U(g) : g \in \mathcal{C}_U^b(x)\}\}. \tag{15}$$

The primal optimization problem then becomes

$$u(x) = \sup_{f \in \mathcal{C}_U(x)} E[U(f)]. \tag{16}$$

As in case 1, to exclude trivial cases we make the following assumption. 

Assumption 4 The value function $u$ defined in (16) satisfies $u(x) < U(\infty)$,
for some $x \in \mathbb{R}$.

Complicated as the domain $\mathcal{C}_U(x)$ might seem, the good news is that in
this setting the optimization domain for the dual problem is simply $\mathcal{M}^a(S)$,
as opposed to the enlarged set $D$ of case 1. That is, the dual problem is now

$$v(y) = \inf_{Q \in \mathcal{M}^a(S)} E\left[V\left(y \frac{dQ}{dP}\right)\right]. \tag{17}$$

We can now state a theorem for case 2 [26, theorem 2.2].


Theorem 2.7 Suppose that assumptions 1, 2 (case 2) and 4 are satisfied.
Then:

1. For any $x \in \mathbb{R}$ and $y > 0$, the problems

$$u(x) = \sup_{f \in \mathcal{C}_U(x)} E[U(f)], \qquad v(y) = \inf_{Q \in \mathcal{M}^a(S)} E\left[V\left(y \frac{dQ}{dP}\right)\right] \tag{18}$$

have unique optimizers $\hat{f}(x) \in \mathcal{C}_U(x)$ and $\hat{Q}(y) \in \mathcal{M}^a(S)$ satisfying

$$U'(\hat{f}(x)) = y \frac{d\hat{Q}(y)}{dP}, \tag{19}$$

where $x$ and $y$ are related by $u'(x) = y$.

2. If it occurs that $\hat{Q}(y) \in \mathcal{M}^e(S)$, then $\hat{f}(x)$ equals the terminal value
$\hat{X}_T(x)$ of a uniformly integrable $\hat{Q}(y)$-martingale of the form

$$\hat{X}_t(x) = x + (\hat{H}(x) \cdot S)_t, \qquad t \in [0, T],$$

for some $\hat{H}(x) \in L(S)$.

Observe that the optimizer $\hat{f}(x) \in \mathcal{C}_U(x)$ does not need to be the
terminal wealth of any portfolio. However, by construction of the set $\mathcal{C}_U(x)$,
its utility can be arbitrarily approximated by the utility of terminal wealth
of admissible portfolios. As for the optimizer of the dual problem, recall
from proposition 2.5 that $\lim_{y \to 0} V(y) = U(\infty)$. Therefore for all cases when
$U(\infty) = \infty$, the minimizer must satisfy $\frac{d\hat{Q}(y)}{dP} > 0$ almost surely, implying
that $\hat{Q}(y) \in \mathcal{M}^e(S)$ and item 2 holds. In such cases, $\hat{f}(x)$ itself can be
achieved by trading according to a portfolio $\hat{H} \in L(S)$. Although $\hat{H}$ might
not be in $\mathcal{H}$, the wealth process generated by it, being a uniformly integrable
$\hat{Q}(y)$-martingale, certainly does not arise from a “doubling strategy”, so that
$\hat{H}$ can be considered a posteriori to be an “allowed” portfolio. Turning this
argument around was the starting point of the aforementioned approaches to
extend the domain of the primal problem to include such portfolios [10], [17], [27].
But the minimizer $\hat{Q}(y)$ is also equivalent to $P$ in other cases, and it is
here that we specialize to an exponential utility of the form
$U(x) = -\frac{e^{-\gamma x}}{\gamma}$, $\gamma > 0$. Observe that for this utility we have

$$\limsup_{x \to \infty} \frac{x U'(x)}{U(x)} = -\infty < 1$$

and

$$\liminf_{x \to -\infty} \frac{x U'(x)}{U(x)} = \infty > 1,$$

so that it satisfies all the conditions for case 2 of assumption 2. Observe
further that its dual function is

$$V(y) = \frac{y}{\gamma}(\log y - 1),$$

so that the dual problem (17) is equivalent to the problem of finding a
measure in $\mathcal{M}^a(S)$ with minimal relative entropy with respect to $P$, that is,

$$\inf_{Q \in \mathcal{M}^a(S)} E\left[\frac{dQ}{dP} \log \frac{dQ}{dP}\right]. \tag{20}$$
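The two claims above can be checked numerically. The sketch below, under illustrative assumptions (a hypothetical risk aversion `gamma = 2.0` and a three-point probability space, neither taken from the paper), compares a brute-force Legendre transform of the exponential utility with the closed form, and verifies that $E[V(y\,dQ/dP)]$ splits into $V(y)$ plus $y/\gamma$ times the relative entropy of $Q$ with respect to $P$.

```python
import numpy as np

gamma = 2.0  # hypothetical risk-aversion parameter

def U(x):
    return -np.exp(-gamma * x) / gamma

def V_closed(y):
    return (y / gamma) * (np.log(y) - 1.0)

def V_numeric(y, grid=np.linspace(-20.0, 20.0, 400001)):
    # Brute-force supremum over x of U(x) - x*y on a fine grid.
    return np.max(U(grid) - grid * y)

ys = [0.1, 1.0, 3.0]
errors = [abs(V_numeric(y) - V_closed(y)) for y in ys]

# Entropy decomposition on a three-point space (illustrative numbers).
p = np.array([0.2, 0.5, 0.3])   # reference measure P
q = np.array([0.3, 0.3, 0.4])   # some Q absolutely continuous w.r.t. P
y = 1.7
lhs = np.sum(p * V_closed(y * q / p))
rhs = V_closed(y) + (y / gamma) * np.sum(q * np.log(q / p))
```

Since the first term of `rhs` does not depend on $Q$, minimizing the dual objective over $Q$ amounts to minimizing the relative entropy, and the minimizer does not depend on $y$.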
It follows from [6] that the minimizer of this problem (which incidentally is
independent of $y$) will be an equivalent local martingale measure provided
there exists at least one measure in $\mathcal{M}^e(S)$ with finite relative entropy,
allowing us to use item 2 of theorem 2.7.


Corollary 2.8 Let $U(x) = -\frac{e^{-\gamma x}}{\gamma}$, $\gamma > 0$, and suppose that assumptions 1
and 4 hold. If in addition we have that

$$E\left[\frac{dQ}{dP} \log \frac{dQ}{dP}\right] < \infty, \tag{21}$$

for some $Q \in \mathcal{M}^e(S)$, then the minimizer $\hat{Q}(y)$ of theorem 2.7 is the
equivalent local martingale measure $\hat{Q}$, independent of $y > 0$, which minimizes the
relative entropy with respect to $P$ among all absolutely continuous martingale
measures. Therefore $\hat{f}(x)$ equals the terminal value $\hat{X}_T(x)$ of a uniformly
integrable $\hat{Q}$-martingale of the form

$$\hat{X}_t(x) = x + (\hat{H}(x) \cdot S)_t,$$

for some $\hat{H}(x) \in L(S)$.
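A one-period toy market makes the corollary concrete. The sketch below assumes a single asset whose price increment takes three values (the numbers, `gamma`, and the bisection bracket are illustrative, not from the paper). It finds the exponentially tilted measure that turns the increment into a martingale difference, checks on a grid that this measure has the smallest relative entropy among martingale measures, and confirms that the marginal utility at the optimal holding is proportional to $dQ/dP$, as in (19).

```python
import numpy as np

p = np.array([0.3, 0.4, 0.3])    # hypothetical real-world probabilities
d = np.array([-1.0, 0.0, 2.0])   # hypothetical price increments S_T - S_0
gamma = 1.5                      # hypothetical risk aversion

def tilted_mean(theta):
    # Unnormalised E_Q[d] under q_i proportional to p_i * exp(theta * d_i);
    # this is increasing in theta.
    return np.sum(p * d * np.exp(theta * d))

lo, hi = -5.0, 5.0               # bracket assumed to contain the root
for _ in range(200):             # plain bisection
    mid = 0.5 * (lo + hi)
    if tilted_mean(mid) > 0.0:
        hi = mid
    else:
        lo = mid
theta = 0.5 * (lo + hi)

q = p * np.exp(theta * d)
q /= q.sum()                     # candidate minimal entropy martingale measure
h_opt = -theta / gamma           # optimal number of shares, independent of x

def entropy(qv):
    return float(np.sum(qv * np.log(qv / p)))

# All martingale measures here have the form (2*q3, 1 - 3*q3, q3).
q3_grid = np.linspace(1e-4, 1.0 / 3.0 - 1e-4, 2001)
grid_entropies = [entropy(np.array([2 * q3, 1 - 3 * q3, q3])) for q3 in q3_grid]

# Marginal utility exp(-gamma*h*d) divided by the density q/p is constant.
ratio = np.exp(-gamma * h_opt * d) / (q / p)
```

The optimal holding `h_opt` does not involve the initial wealth, which is the wealth-independence property of exponential utility mentioned in the introduction.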


We now move to the subject of solving the hedging problem (1). Once
more the solutions will take place in different domains and involve different
techniques depending on whether our utility function falls into case 1 or case
2 of assumption 2. In either case, we are going to assume that the random
claim that we want to hedge is a bounded random variable.


Assumption 5 $B \in L^\infty(\Omega, \mathcal{F}_T, P)$.

We start with the first case, which was solved in ||. Observe that to
account for the presence of a random claim at time $T$, it is not enough to
consider positive random variables which are dominated by terminal values
of admissible portfolios, as was done in (7). We therefore consider the set

$$\mathcal{C}(x) = \{g \in L^0(\Omega, \mathcal{F}_T, P) : g \leq x + (H \cdot S)_T, \text{ for some } H \in \mathcal{H}\}. \tag{22}$$

The primal problem now becomes

$$u(x) = \sup_{g \in \mathcal{C}(x)} E[U(g - B)], \tag{23}$$

where it is understood that $U(x) = -\infty$ whenever $x < 0$.

As in the previous cases, we assume the following. 








Assumption 6 The value function $u$ defined in (23) satisfies $|u(x)| < \infty$
for some $x > \|B\|_\infty$.


Recall that the crucial point in the proof of theorem 2.6 was the use of
the polar relation between the sets $\mathcal{C}$ and $D$ as subsets of $L^0_+(\Omega, \mathcal{F}_T, P)$, for
which a version of the bipolar theorem can be used [4]. In the absence of
such results for subsets of $L^0(\Omega, \mathcal{F}_T, P)$ as a whole, we are led to consider an
appropriate subset of $L^\infty(P)$, namely

$$\mathcal{C} = \mathcal{C}(0) \cap L^\infty(\Omega, \mathcal{F}_T, P). \tag{24}$$

Accordingly, to obtain a perfect polar relation, we need to modify the
definition for the domain of the dual problem. The natural space to define
the polar of a subset of $L^\infty$ is its topological dual $(L^\infty)^*$. We therefore define

$$\mathcal{D} = \{Q \in (L^\infty)^* : \|Q\| = 1 \text{ and } Q(g) \leq 0 \text{ for all } g \in \mathcal{C}\}. \tag{25}$$

To obtain a more concrete characterization of this set, notice that $\mathcal{D} \subset (L^\infty)^*_+$
(since $\mathcal{C}$ contains all the negative bounded random variables). The good news
about the set $(L^\infty)^*_+$ is that it can be identified with the set of all nonnegative
finitely additive bounded set functions on $\mathcal{F}_T$ which vanish on the $P$-null sets.
Moreover, any such function $Q \in (L^\infty)^*_+$ can be uniquely decomposed into
its regular part $Q^r$ and its singular part $Q^s$ as follows

$$Q = Q^r + Q^s,$$

where $Q^r \geq 0$ is countably additive and $Q^s \geq 0$ is purely finitely additive.
Naturally, $Q^r$ corresponds to a measure which is absolutely continuous with
respect to $P$ and whose Radon-Nikodym derivative is denoted by $\frac{dQ^r}{dP}$. We
now look at the subset of regular elements in $\mathcal{D}$, namely

$$\mathcal{D}^r = \{Q \in \mathcal{D} : Q^s = 0\} = \mathcal{D} \cap L^1(\Omega, \mathcal{F}_T, P). \tag{26}$$


Since all elements in $\mathcal{D}$ have unit norm, it follows that $\mathcal{D}^r$ consists of
probability measures which are absolutely continuous with respect to $P$. In fact, since
we are assuming that the processes $S$ are locally bounded, it can be shown
that $\mathcal{D}^r$ is nothing but our familiar $\mathcal{M}^a(S)$, the set of absolutely continuous
local martingale measures for $S$ [^]. lemma 1.1 (b)]. The dual problem in this
case is

$$v(y) = \inf_{Q \in \mathcal{D}} \left\{ E\left[V\left(y \frac{dQ^r}{dP}\right) - y \frac{dQ^r}{dP} B\right] - y Q^s(B) \right\}, \tag{27}$$
where one should notice that the domain of optimization is the entire $\mathcal{D}$,
with the dual function $V$ contributing to it only through its regular subset
$\mathcal{D}^r$, whereas the dependence on the claim $B$ is manifested on both its regular
and singular parts. In this respect, it is worth mentioning that our old set
$D$ (for Merton’s problem) can also be characterized as the regular part of
the weak-star closure of the convex solid hull of $\mathcal{M}^a(S)$ in $(L^\infty)^*$ (whose
elements can have total $P$-mass strictly less than 1). From this perspective,
it becomes clear that the extra care necessary to treat the hedging problem
in this case comes from dealing with both the regular and singular parts of
elements in the domain of the dual problem. The main result in this case is
|| theorem 3.1]

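Theorem 2.9 Suppose that assumptions 1, 2 (case 1), 5 and 6 are satisfied.
Let $x_0 = \sup_{Q \in \mathcal{D}} Q(B)$. Then, for any $y > 0$, the dual problem

$$v(y) = \inf_{Q \in \mathcal{D}} \left\{ E\left[V\left(y \frac{dQ^r}{dP}\right) - y \frac{dQ^r}{dP} B\right] - y Q^s(B) \right\} \tag{28}$$

has a unique (up to singular part) optimizer $\hat{Q}(y) \in \mathcal{D}$ and, for any $x > x_0$,
the primal problem

$$u(x) = \sup_{X_T \in \mathcal{C}(x)} E[U(X_T - B)] \tag{29}$$

has a unique optimizer $\hat{X}_T(x) \in \mathcal{C}(x)$ satisfying

$$U'(\hat{X}_T(x) - B) = y \frac{d\hat{Q}^r(y)}{dP}, \tag{30}$$

where $x$ and $y$ are related by $u'(x) = y$.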

Regarding the second case of assumption 2, the optimal hedging problem
has been solved in [10] for the exponential utility and claims $B$ satisfying
a boundedness condition weaker than assumption 5. In [22], the problem
was solved for general utility functions with reasonable asymptotic elasticity
(which include the exponential) but bounded claims (although some remarks
are offered on how to extend the result to possibly unbounded ones). We
describe here the solution of [22], since it follows the same techniques of
[26] and p|, for which we have already developed most of the notation. In
the presence of a claim satisfying assumption 5, the analogue of the set $\mathcal{C}_U^b$
defined in (14) is

$$\mathcal{C}_U^b(x) = \{g \in L^0(\Omega, \mathcal{F}_T, P) : g \leq x + (H \cdot S)_T - B \text{ for some } H \in \mathcal{H} \text{ and } U(g) \in L^1(\Omega, \mathcal{F}_T, P)\}. \tag{31}$$

Similarly, we replace the set $\mathcal{C}_U(x)$ by

$$\mathcal{C}_U(x) = \{f \in L^0(\Omega, \mathcal{F}_T, P; \mathbb{R} \cup \{\infty\}) : U(f - B) \text{ is in the } L^1(P)\text{-closure of } \{U(g) : g \in \mathcal{C}_U^b(x)\}\}. \tag{32}$$

The interpretation of this set is the same as before, only this time we have
to account for the random claim $B$. Namely, it consists of random variables
which, after subtracting the claim $B$, have a utility that can be arbitrarily
approximated by the utility of terminal wealth of admissible portfolios less
the claim $B$.

Our modified primal problem now reads

$$u(x) = \sup_{f \in \mathcal{C}_U(x)} E[U(f - B)], \tag{33}$$

for which we assume the following. 


Assumption 7 The value function $u$ defined in (33) satisfies $u(x) < U(\infty)$,
for some $x \in \mathbb{R}$.

As with the case of no claim, when we pass to utilities defined on the
entire real line the domain of the dual problem becomes simpler, being just
the set $\mathcal{M}^a(S)$ (as opposed to the complicated set $\mathcal{D}$). In the same vein,
the statement of the dual problem is much more transparent, since it does
not involve the singular measures that we encountered before. It is simply
(compare with (27))

$$v(y) = \inf_{Q \in \mathcal{M}^a(S)} E\left[V\left(y \frac{dQ}{dP}\right) - y \frac{dQ}{dP} B\right]. \tag{34}$$


The next theorem [22, theorem 1.1] provides the existence and uniqueness
of solutions to the hedging problem for utilities defined on the entire $\mathbb{R}$. The
remark following theorem 2.7 about the optimal measure $\hat{Q}(y)$ being actually
equivalent to $P$ when $U(\infty) = \infty$ applies here as well (as can be seen from
the form of the dual problem (34)).
Theorem 2.10 Suppose that assumptions 1, 2 (case 2), 5 and 7 are satisfied.
Then:

1. For any $x \in \mathbb{R}$ and $y > 0$, the problems

$$u(x) = \sup_{f \in \mathcal{C}_U(x)} E[U(f - B)], \qquad v(y) = \inf_{Q \in \mathcal{M}^a(S)} E\left[V\left(y \frac{dQ}{dP}\right) - y \frac{dQ}{dP} B\right]$$

have unique optimizers $\hat{f}(x) \in \mathcal{C}_U(x)$ and $\hat{Q}(y) \in \mathcal{M}^a(S)$ satisfying

$$U'(\hat{f}(x) - B) = y \frac{d\hat{Q}(y)}{dP}, \tag{35}$$

where $x$ and $y$ are related by $u'(x) = y$.

2. If it occurs that $\hat{Q}(y) \in \mathcal{M}^e(S)$, then $\hat{f}(x)$ equals the terminal value
$\hat{X}_T(x)$ of a uniformly integrable $\hat{Q}(y)$-martingale of the form

$$\hat{X}_t(x) = x + (\hat{H}(x) \cdot S)_t,$$

for some $\hat{H}(x) \in L(S)$.

To assert the analogue of corollary 2.8, namely that the optimal measure
$\hat{Q}(y)$ is actually equivalent to $P$ for the case of exponential utility, we
follow [10] and consider the change from $P$ to an equivalent probability
measure $P_B$ with density

$$\frac{dP_B}{dP} = c_B e^{\gamma B}, \qquad \text{with } c_B^{-1} = E[e^{\gamma B}]. \tag{36}$$

Therefore, for any $Q \ll P$, we have that

$$E\left[\frac{dQ}{dP} \log \frac{dQ}{dP}\right] = E_{P_B}\left[\frac{dQ}{dP_B} \log \frac{dQ}{dP_B}\right] + \log c_B + E\left[\frac{dQ}{dP} \gamma B\right]. \tag{37}$$

It then follows from the boundedness of $B$ that $Q$ has finite relative entropy
with respect to $P$ if and only if it has finite relative entropy with respect to
$P_B$.
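The entropy identity (37) follows from writing $\log\frac{dQ}{dP} = \log\frac{dQ}{dP_B} + \log c_B + \gamma B$ and taking expectations under $Q$. A minimal numerical check on a three-point space is sketched below; the probabilities, the claim values and `gamma` are illustrative assumptions, not data from the paper.

```python
import numpy as np

gamma = 0.8                              # hypothetical risk aversion
p = np.array([0.25, 0.25, 0.5])          # reference measure P
q = np.array([0.4, 0.35, 0.25])          # some Q absolutely continuous w.r.t. P
B = np.array([1.0, -0.5, 2.0])           # bounded claim, one value per state

c_B = 1.0 / np.sum(p * np.exp(gamma * B))    # normalising constant in (36)
p_B = p * c_B * np.exp(gamma * B)            # the tilted measure P_B

# Left and right sides of (37): E[Z log Z] under P equals E_Q[log Z].
lhs = np.sum(q * np.log(q / p))
rhs = np.sum(q * np.log(q / p_B)) + np.log(c_B) + gamma * np.sum(q * B)
```

Because $B$ is bounded, the last two terms of `rhs` are finite for every $Q$, which is why finiteness of the entropy with respect to $P$ and with respect to $P_B$ are equivalent.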

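Now notice that the dual problem in this case is

$$
\begin{aligned}
v(y) &= \inf_{Q \in \mathcal{M}^a(S)} E\left[\frac{y}{\gamma} \frac{dQ}{dP}\left(\log\left(y \frac{dQ}{dP}\right) - 1\right) - y \frac{dQ}{dP} B\right] \\
&= \frac{y}{\gamma}(\log y - 1) + \frac{y}{\gamma} \inf_{Q \in \mathcal{M}^a(S)} E\left[\frac{dQ}{dP} \log \frac{dQ}{dP} - \gamma \frac{dQ}{dP} B\right] \\
&= \frac{y}{\gamma}(\log y c_B - 1) + \frac{y}{\gamma} \inf_{Q \in \mathcal{M}^a(S)} E_{P_B}\left[\frac{dQ}{dP_B} \log \frac{dQ}{dP_B}\right],
\end{aligned}
\tag{38}
$$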
from which we see that its minimizer coincides with the minimizer of the
relative entropy with respect to $P_B$ over all the absolutely continuous
martingale measures for $S$. But from the argument preceding corollary 2.8, such
a minimizer is equivalent to $P_B$ (and therefore to $P$) provided there is at
least one $Q$ in $\mathcal{M}^e(S)$ whose relative entropy with respect to $P_B$ is finite,
which in turn is the same as having at least one $Q$ in $\mathcal{M}^e(S)$ whose relative
entropy with respect to $P$ is finite. This suffices to prove:


Corollary 2.11 Let $U(x) = -\frac{e^{-\gamma x}}{\gamma}$, $\gamma > 0$, and suppose that assumptions 1,
5 and 7 hold. If in addition we have that

$$E\left[\frac{dQ}{dP} \log \frac{dQ}{dP}\right] < \infty, \tag{39}$$

for some $Q \in \mathcal{M}^e(S)$, then the minimizer $\hat{Q}(y)$ of theorem 2.10 is the
equivalent local martingale measure $\hat{Q}$, independent of $y > 0$, which minimizes the
relative entropy with respect to $P_B$ among all absolutely continuous
martingale measures. Therefore $\hat{f}(x)$ equals the terminal value $\hat{X}_T(x)$ of a uniformly
integrable $\hat{Q}$-martingale of the form

$$\hat{X}_t(x) = x + (\hat{H}(x) \cdot S)_t,$$

for some $\hat{H}(x) \in L(S)$.

We end this review section with a discussion about complete markets,
defined to be those for which there is exactly one equivalent martingale measure
$Q$, that is, $\mathcal{M}_e(S)$ is the singleton $\{Q\}$. The second fundamental theorem of
asset pricing relates this definition with the existence of a replicating portfolio
for each bounded $\mathcal{F}_T$-measurable random variable.


Theorem 2.12 (FTAP II) Suppose that assumption 1 holds. Then the
following are equivalent:

1. The market is complete (i.e. $\mathcal{M}_e(S) = \{Q\}$).

2. For each $X \in L^\infty(\Omega, \mathcal{F}_T, P)$ there exist a unique admissible portfolio
$H \in \mathcal{H}$ and a constant $x \in \mathbb{R}$ such that

$$X = x + (H \cdot S)_T. \qquad (40)$$
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For complete markets, Merton’s problem can be solved almost explicitly 
in terms of $dQ/dP$. The results of the next two theorems, which are slightly
stronger versions of |2(], theorem 2.0] and [^, theorem 2.1] (since we are 
assuming reasonable asymptotic elasticity for all our utility functions), are 
the analogues of theorems |2.6| and |2.7| for complete markets. Notice that 
for either case 1 or case 2 in assumption [2|, the value function for the dual 
problem is reduced to 


$$v(y) = E\left[V\left(y\frac{dQ}{dP}\right)\right], \quad y > 0 \qquad (41)$$


(for case 1 this was proved in [|(], lemma 4.3]; for case 2 it is trivial, since 
$\mathcal{M}_e(S) = \{Q\}$ implies that $\mathcal{M}_a(S) = \{Q\}$ as well).


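Theorem 2.13 Suppose that $\mathcal{M}_e(S) = \{Q\}$ and assumptions 2 (case 1) and
[| hold. Then, for any $x \in \mathrm{dom}(U)$, the problem

$$u(x) = \sup_{X_T \in \mathcal{C}(x)} E[U(X_T)] \qquad (42)$$

has a unique optimizer $X_T(x) \in \mathcal{C}(x)$ given by

$$X_T(x) = -V'\left(y\frac{dQ}{dP}\right), \qquad (43)$$

where $y$ is the solution to the equation

$$E\left[-V'\left(y\frac{dQ}{dP}\right)\frac{dQ}{dP}\right] = x. \qquad (44)$$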


Theorem 2.14 Suppose that $\mathcal{M}_e(S) = \{Q\}$ and assumptions 2 (case 2) and
[| hold. Then, for any $x \in \mathbb{R}$, the problem

$$u(x) = \sup_{f \in \mathcal{C}_U(x)} E[U(f)], \qquad (45)$$

has a unique optimizer $f(x) \in \mathcal{C}_U(x)$ given by

$$f(x) = -V'\left(y\frac{dQ}{dP}\right), \qquad (46)$$
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where y is the solution to the equation 


$$E\left[-V'\left(y\frac{dQ}{dP}\right)\frac{dQ}{dP}\right] = x. \qquad (47)$$

Moreover, $f(x)$ equals the terminal value $X_T(x)$ of a uniformly integrable
$Q$-martingale of the form

$$X_t(x) = x + (H(x) \cdot S)_t, \quad t \in [0,T],$$

for some $H(x) \in L(S)$.

There is no need to state versions of theorems 2.9 and 2.10, since for
complete markets the solution to the hedging problem ([[]) for a bounded
claim $B$ can be expressed in terms of the solution of Merton’s problem.
Indeed, by theorem 2.12, there exists $(B_0, H^B)$ such that

$$B = B_0 + (H^B \cdot S)_T$$

and this can now be used to write ([!]) in the form of the Merton problem

$$\sup_{H \in \mathcal{H}} E\left[U\left(x - B_0 + ((H - H^B) \cdot S)_T\right)\right].$$

Therefore, if $H^0(x - B_0)$ is the optimal portfolio for the Merton problem
starting with wealth $x - B_0$ obtained either from theorem 2.13 or from
theorem 2.14, then the optimal portfolio for the hedging problem for the claim
$B$ starting with wealth $x$ will be given by

$$H(x) = H^0(x - B_0) + H^B. \qquad (48)$$



3 The dynamics of portfolio selection 

The theorems of the previous section give precise statements ensuring the 
existence and uniqueness of solutions for both the optimal investment and 
optimal hedging problems for different types of utility functions. We have 
seen that under well defined conditions, there is a clear sense in which the 
optimal solution can always be approximated arbitrarily well by trading
according to admissible portfolios. In what follows, to adopt a unified notation,
we will write $H \in \mathcal{A}$, which loosely stands for “allowed” portfolios. In the
back of our minds, however, we will keep the rigorous notion of what it stands
for: admissible portfolios which, starting with initial capital $x \in \mathbb{R}$, generate
terminal wealths in C(x) and C(x) for theorems 2.6 and 2.9, respectively,
or terminal wealths whose utilities arbitrarily approximate the utility of the
optimal solutions (in the $L^1$ sense) $f(x)$ for theorems 2.7 and 2.10. We also
use the notation $\mathcal{A}_{(s,t]}$ for portfolio processes defined only on the time interval
$(s,t]$, as well as the shorthand for stochastic integration in this interval

$$(H \cdot S)_s^t := \int_s^t H_u \, dS_u, \quad 0 \le s \le t \le T.$$

Consistently with our previous section we have that $\mathcal{A} = \mathcal{A}_{(0,T]}$ and
$(H \cdot S)_t = (H \cdot S)_0^t$, $t \in [0,T]$.

To understand better the optimal selection problem it is useful to
formulate a dynamical version of it. Let us write $H^{(x,0)}$ for the optimal solution to
the static primal problem

$$u(x) = \sup_{H \in \mathcal{A}} E[U(x + (H \cdot S)_T - B)], \qquad (49)$$

obtained according to the theorems of the previous section, that is, starting
at time 0 with initial wealth $x$. For any intermediate time $t \in [0,T]$ and
$x \in \mathrm{dom}(U)$, we can write

$$\begin{aligned}
u(x) &= \sup_{H \in \mathcal{A}} E[U(x + (H \cdot S)_T - B)] \\
&= \sup_{H \in \mathcal{A}_{(0,t]}} E\left[\sup_{H \in \mathcal{A}_{(t,T]}} E_t\left[U\left(x + (H \cdot S)_0^t + (H \cdot S)_t^T - B\right)\right]\right], \qquad (50)
\end{aligned}$$


which leads us to the study of the conditional problem

$$u_t(w) = \sup_{H \in \mathcal{A}_{(t,T]}} E_t[U(w + (H \cdot S)_t^T - B)], \qquad (51)$$

where $w \in \mathbb{R}$ represents the wealth accumulated up to time $t$. If we trade
according to $H^{(x,0)}$ up to time $t$, that is, if $w = x + (H^{(x,0)} \cdot S)_t$, then we must
have

$$u_t(w) = E_t[U(w + (H^{(w,t)} \cdot S)_t^T - B)], \qquad (52)$$

for some portfolio $H^{(w,t)} \in \mathcal{A}_{(t,T]}$ (that is, starting at time $t$ with wealth
$w$) which agrees with the restriction of $H^{(x,0)}$ on the interval $(t,T]$. In other
words, the optimal portfolio $H^{(x,0)}$ is also conditionally optimal. This is a
special instance of the dynamic programming principle, which for this stochastic
control problem has the form

$$u_s(w) = \sup_{H \in \mathcal{A}_{(s,t]}} E_s[u_t(w + (H \cdot S)_s^t)], \qquad (53)$$

for $0 \le s \le t \le T$.

The certainty equivalent value and the indifference price 

There is a useful way to view the value function $u_t(w)$. By the
intermediate value theorem, $U^{-1}(u_t(w))$ exists for each $(w,t)$, $P$-almost surely. This
defines, for each $(w,t)$, the random variable

$$B_t(w) = w - U^{-1}(u_t(w)), \qquad (54)$$

which can be called the certainty equivalent value of the claim $B$ at time $t$.
Since

$$U(w - B_t(w)) = E_t[U(w + (H^{(w,t)} \cdot S)_t^T - B)],$$

the certain utility achieved by investing the amount $w - B_t(w)$ in the risk free
account equals the expected utility of the terminal wealth $w + (H^{(w,t)} \cdot S)_t^T - B$
of the optimal hedging portfolio. For Merton’s problem, where $B = 0$,
the amount $-B_t^0(w)$ indicates by how much the optimally invested portfolio
outperforms the constant portfolio $w$ over the period $(t,T]$. By putting $s = 0$
in (53) we obtain

$$\begin{aligned}
u(x) &= \sup_{H \in \mathcal{A}_{(0,t]}} E[u_t(x + (H \cdot S)_0^t)] \\
&= \sup_{H \in \mathcal{A}_{(0,t]}} E\left[U\left(x + (H \cdot S)_0^t - B_t(x + (H \cdot S)_0^t)\right)\right]. \qquad (55)
\end{aligned}$$

Therefore, $B_t(w)$ represents a wealth dependent effective value of the claim
$B$ at time $t$.

Following |T6| (according to [|lj), a clear interpretation of the certainty 
equivalent values can be given by considering an investor who, holding wealth 
$w$ at time $t$, must decide the minimum amount $\pi$ to charge when selling a
claim $B$. If he sells the claim for $\pi$ and hedges optimally against the claim
by holding the portfolio $H^{(w+\pi,t)}$, he will achieve an expected utility

$$E_t[U(w + \pi + (H^{(w+\pi,t)} \cdot S)_t^T - B)] = U(w + \pi - B_t(w + \pi)).$$
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If, however, he does not sell the claim and invests optimally for Merton’s 
problem, he achieves 

$$E_t[U(w + (H^{(w,t,0)} \cdot S)_t^T)] = U(w - B_t^0(w)).$$

The indifference price of the claim $B$ at time $t$ for wealth $w$ is the value
$\pi = \pi_t^B(w)$ which makes these equal, that is, it is the solution of

$$\pi_t^B(w) = B_t(w + \pi_t^B(w)) - B_t^0(w). \qquad (56)$$

Since we have defined these concepts from the point of view of an agent faced
with a liability $B$ at time $t$, this indifference price corresponds to a “seller’s
price”. To obtain the correct notion of a “buyer’s price”, we just need to
consider the reverse claim $-B$, which then produces a terminal wealth with
expected utility equaling that of $w - \pi - B_t(w - \pi)$ when bought by $\pi$ (here
$B_t$ denotes the certainty equivalent value of the claim $-B$). The
indifference price is now the value of $\pi$ that makes this equal to the amount
whose certain utility equals the expected utility for Merton’s problem starting
with wealth $w$ at time $t$, which by definition is $w - B_t^0(w)$. In other words,
it is the solution of

$$\pi = B_t^0(w) - B_t(w - \pi), \qquad (57)$$

which therefore equals $-\pi_t^{-B}(w)$ as defined in (56).

In a complete market, the indifference price equals the risk-neutral price, 
that is, if the bounded claim B is written in terms of its unique replicating 
admissible portfolio $H^B$ as $B = B_0 + (H^B \cdot S)_T$, then

$$\pi_t^B = B_0 + (H^B \cdot S)_0^t = E_t^Q[B], \qquad (58)$$

where $Q$ is the unique equivalent martingale measure. The first equality
above remains true in incomplete markets if the claim $B$ happens to satisfy
$B = B_0 + (H^B \cdot S)_T$ for some admissible portfolio $H^B$.

The Davis price 


Let us assume for a moment that the solutions of the dual problems in
theorems 2.6 and 2.7 are equivalent martingale measures (in case 2 we have
seen that this is indeed the case for the exponential utility under the finite
entropy condition; for counterexamples where in case 1 the solution fails
to be a martingale, see |2D[). If, for each $\varepsilon > 0$, we let $B_t^\varepsilon(w)$ denote the
certainty equivalent value of $\varepsilon B$, then the Davis price of $B$ is defined to be

$$\pi_t^{\mathrm{Davis}}(w) = \left.\frac{dB_t^\varepsilon(w)}{d\varepsilon}\right|_{\varepsilon=0}.$$

By differentiating the identity

$$U(w - B_t^\varepsilon(w)) = E_t[U(w + (H^{(\varepsilon,w,t)} \cdot S)_t^T - \varepsilon B)] \qquad (59)$$

at $\varepsilon = 0$ and noting that, by optimality,

$$E_t\left[U'\left(w + (H^{(0,w,t)} \cdot S)_t^T\right)\left(\left.\frac{dH^{(\varepsilon,w,t)}}{d\varepsilon}\right|_{\varepsilon=0} \cdot S\right)_t^T\right] = 0,$$

we see that

$$\pi_t^{\mathrm{Davis}}(w) = \frac{E_t\left[U'\left(w + (H^{(0,w,t)} \cdot S)_t^T\right) B\right]}{U'(w - B_t^0(w))}.$$

But from the theory of the Merton problem, a dynamical version of either 
© or gives 

$$U'\left(w + (H^{(0,w,t)} \cdot S)_t^T\right) = U'(w - B_t^0(w)) \frac{dQ_t(y)}{dP}$$

for $y = u_t'(w)$, where $Q_t(y)$ stands for the optimal solution to the conditional
dual problem. Thus the Davis price of $B$ is given by the expectation pricing

$$\pi_t^{\mathrm{Davis}}(w) = E_t^{Q_t(y)}[B]. \qquad (60)$$

We remark that the indifference price, being intrinsically nonlinear, does 
not in general satisfy useful criteria such as put-call parity. The Davis price, 
on the other hand, does. 
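
As a numerical illustration of the Davis price, the following sketch works in a hypothetical one-period trinomial market with exponential utility $U(x) = -e^{-\gamma x}/\gamma$ (an assumption made here for concreteness; the model, payoffs and probabilities are invented for the example). It compares a finite-difference derivative of the certainty equivalent value of $\varepsilon B$ with the expectation of $B$ under the measure whose density is proportional to $e^{-\gamma h_0 \Delta S}$, where $h_0$ is the Merton optimizer.

```python
import math

gamma = 1.0
dS = [-0.2, 0.0, 0.3]     # hypothetical one-period price moves (trinomial, so incomplete)
P = [0.3, 0.4, 0.3]       # their probabilities
B = [0.2, 0.0, 0.0]       # hypothetical put-like claim

def objective(h, eps):
    """E[exp(-gamma * (h * dS - eps * B))] for a holding of h shares."""
    return sum(p * math.exp(-gamma * (h * d - eps * b)) for p, d, b in zip(P, dS, B))

def minimise(eps, lo=-50.0, hi=50.0):
    """Ternary search for the minimiser of the convex map h -> objective(h, eps)."""
    for _ in range(300):
        a, c = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if objective(a, eps) < objective(c, eps):
            hi = c
        else:
            lo = a
    h = 0.5 * (lo + hi)
    return h, objective(h, eps)

def cert_equiv(eps):
    """Certainty equivalent value of the claim eps * B at time 0."""
    return math.log(minimise(eps)[1]) / gamma

# Pricing measure built from the Merton optimiser (eps = 0).
h0, _ = minimise(0.0)
weights = [p * math.exp(-gamma * h0 * d) for p, d in zip(P, dS)]
q = [w / sum(weights) for w in weights]
davis_expectation = sum(qi * bi for qi, bi in zip(q, B))

# Central finite difference of the certainty equivalent value at eps = 0.
eps = 1e-4
davis_fd = (cert_equiv(eps) - cert_equiv(-eps)) / (2.0 * eps)
print(davis_expectation, davis_fd)
```

The two numbers agree up to discretization error, and the weights `q` price the stock as a martingale, as the first order condition requires.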


Exponential utility 


An important simplification occurs if we specialize to the exponential
utility $U(x) = -e^{-\gamma x}/\gamma$, $\gamma > 0$. A look at (51) shows that $u_t(w)$ factorizes as

$$u_t(w) = -\frac{e^{-\gamma w}}{\gamma} \inf_{H \in \mathcal{A}_{(t,T]}} E_t\left[e^{-\gamma (H \cdot S)_t^T + \gamma B}\right] = -\frac{e^{-\gamma w}}{\gamma} v_t. \qquad (61)$$

Here we see that $v_t$ is a time dependent but wealth independent $\mathcal{F}_t$-measurable
random variable. We also see that the certainty equivalent value

$$B_t = \frac{1}{\gamma} \log v_t, \qquad (62)$$

the optimal portfolio $H^B$ and the indifference price $\pi_t^B$ are all wealth
independent processes.
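
To make the wealth independent quantities concrete, here is a minimal one-period sketch. The trinomial market and the two claims are hypothetical choices for illustration only. It evaluates $v_0$, the certainty equivalent value (62) and the indifference price (56), which for exponential utility reduces to the difference of two certainty equivalent values.

```python
import math

gamma = 1.0
dS = [-0.2, 0.0, 0.3]     # hypothetical one-period price moves
P = [0.3, 0.4, 0.3]       # their probabilities

def v(B):
    """v_0 = min over h of E[exp(-gamma * h * dS + gamma * B)], by ternary search."""
    def f(h):
        return sum(p * math.exp(-gamma * (h * d - b)) for p, d, b in zip(P, dS, B))
    lo, hi = -50.0, 50.0
    for _ in range(300):
        a, c = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if f(a) < f(c):
            hi = c
        else:
            lo = a
    return f(0.5 * (lo + hi))

def certainty_equivalent(B):
    """Equation (62) at t = 0."""
    return math.log(v(B)) / gamma

def indifference_price(B):
    """Equation (56): wealth independent, so a plain difference."""
    return certainty_equivalent(B) - certainty_equivalent([0.0] * len(B))

put_like = [0.2, 0.0, 0.0]                    # not replicable in this market
replicable = [0.1 + 0.5 * d for d in dS]      # B = 0.1 + 0.5 * dS
print(indifference_price(put_like), indifference_price(replicable))
```

For the replicable claim the price equals the constant 0.1, in line with the remark that the first equality in (58) persists in incomplete markets for claims of the form $B_0 + (H^B \cdot S)_T$.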

Example (geometric Brownian motion): Consider now a market of $d$
stocks whose prices, discounted by the constant interest rate $r$, satisfy

$$\frac{dS_t^i}{S_t^i} = (\mu^i - r)\,dt + \sum_{\alpha=1}^d \sigma^{i\alpha}\, dW_t^\alpha, \qquad (63)$$

where $\mu \in \mathbb{R}^d$ and the invertible $d \times d$ matrix $\sigma^{i\alpha}$ are constants and $(W^\alpha)$ is a
$d$-dimensional $P$-Brownian motion. This market is complete and, as is well
known, the unique equivalent martingale measure $Q$ has Radon-Nikodym
derivative

$$\frac{dQ}{dP} = \exp\left(-\int_0^T \left(\sum_\alpha \lambda^\alpha\, dW_t^\alpha + \frac{1}{2}\|\lambda\|^2\, dt\right)\right) \qquad (64)$$

with constant market price of risk $\lambda^\alpha = \sum_i (\sigma^{-1})^{\alpha i}(\mu^i - r)$.

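For the exponential utility function with initial wealth $x$, the optimal
discounted terminal wealth $X_T$ is given by

$$e^{-\gamma X_T} = y\frac{dQ}{dP}, \qquad (65)$$

for a constant $y$ to be determined. From this and (64), one finds

$$X_T = -\frac{1}{\gamma}\left(\log y + \frac{1}{2}\int_0^T \|\lambda\|^2\, dt\right) + \frac{1}{\gamma}\int_0^T \sum_{i,j} (\mu^j - r)\left((\sigma\sigma^\top)^{-1}\right)^{ji} \frac{dS_t^i}{S_t^i},$$

so the optimal portfolio for Merton’s problem is

$$(H^0)_t^i = \frac{\sum_j (\mu^j - r)\left((\sigma\sigma^\top)^{-1}\right)^{ji}}{\gamma S_t^i} \qquad (66)$$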
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and y is the solution to the equation 

$$x = -\frac{1}{\gamma}\left(\log y + \frac{1}{2}\|\lambda\|^2 T\right). \qquad (67)$$

The certainty equivalent value for Merton’s problem in this market turns out
to be

$$B_t^0 = \frac{1}{2\gamma}\|\lambda\|^2 (t - T). \qquad (68)$$

Since the market is complete, the indifference price of any bounded claim
$B$ equals its risk-neutral price (its Black-Scholes price), so the certainty
equivalent value is given by

$$B_t = \frac{1}{2\gamma}\|\lambda\|^2 (t - T) + E_t^Q[B]. \qquad (69)$$

Finally, the optimal hedging portfolio for the claim $B$ is

$$H_t = H_t^0 + H_t^B, \qquad (70)$$

where $H_t^B$ is the replicating portfolio for $B$.
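
For a single stock ($d = 1$) the formulas (66), (67) and (68) reduce to one-line expressions. The sketch below evaluates them for the parameter values used in the numerical section of this paper ($\mu = 0.1$, $\sigma = 0.2$, $r = 0$, $T = 1$) with $\gamma = 1$; the function names are illustrative only.

```python
import math

mu, sigma, r, gamma, T = 0.1, 0.2, 0.0, 1.0, 1.0

lam = (mu - r) / sigma                      # market price of risk for d = 1

def merton_holding(S):
    """Equation (66) for d = 1: number of shares held when the price is S."""
    return (mu - r) / (gamma * sigma ** 2 * S)

def merton_cert_equiv(t):
    """Equation (68): certainty equivalent value of Merton's problem at time t."""
    return 0.5 * lam ** 2 * (t - T) / gamma

def y_from_wealth(x):
    """Equation (67) solved for the multiplier y given initial wealth x."""
    return math.exp(-gamma * x - 0.5 * lam ** 2 * T)

print(lam, merton_holding(1.0), merton_cert_equiv(0.0), y_from_wealth(0.0))
```

With these numbers the investor holds about 2.5 shares at $S = 1$ and the certainty equivalent value at time 0 is $-0.125$, so the optimal continuous-time strategy is worth a certain gain of 0.125 over holding cash.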

4 Discrete time hedging 

We now restrict to discrete time hedging, where the portfolio processes have
the form

$$H_t = \sum_{k=1}^K H_k \mathbf{1}_{(t_{k-1}, t_k]}(t) \qquad (71)$$

where each $H_k$ is an $\mathbb{R}^d$-valued $\mathcal{F}_{k-1}$-measurable random variable. We take the
discrete time partition of the interval $[0,T]$ to be of the form

$$t_0 = 0 < t_1 = \frac{T}{K} < \dots < t_k = \frac{kT}{K} < \dots < t_K = T$$

and use the notation $S_j := S_{t_j}$ for discrete time stochastic processes. The
discounted wealth process will be $X_j = x + (H \cdot S)_j$, with the notation
$(H \cdot S)_j := \sum_{k=1}^j H_k \Delta S_k$, $(H \cdot S)_k^j := (H \cdot S)_j - (H \cdot S)_k$ and
$\Delta S_k := S_k - S_{k-1}$.
Now the dynamic programming problem (53) falls into $K$ subproblems

$$u_{k-1}(x) = \sup_{H_k} E_{k-1}[u_k(x + H_k \Delta S_k)], \quad k = K, K-1, \dots, 1 \qquad (72)$$

subject to the terminal condition $u_K(x) = U(x - B)$, which reduces to $U(x)$
for Merton’s problem. Then, for each $x$ this
defines a process $u_k(x)$. Similarly, the certainty equivalent value process
$B_k(x)$ is defined iteratively by

$$U(x - B_{k-1}(x)) = \sup_{H_k} E_{k-1}[U(x + H_k \Delta S_k - B_k(x + H_k \Delta S_k))] \qquad (73)$$

with $B_K(x)$ taken equal to the terminal claim $B$. In both formulations of the
problem, let $H_k(x)$ denote the maximizer, which of course is an $\mathcal{F}_{k-1}$-measurable
random variable.

In what follows, we will be largely concerned with markets and claims 
which satisfy the Markovian conditions: 

Assumption 8 The market is Markovian and its state variables
$Z = (S^1, \dots, S^d, Y^1, \dots, Y^{n-d})$ lie in a finite dimensional state space $\mathcal{S} \subset \mathbb{R}^n$.

Assumption 9 The contingent claim is taken to be of the form $B = \Phi(Z_T)$
for a bounded Borel function $\Phi : \mathcal{S} \to \mathbb{R}$.

In these assumptions, we interpret $S$ as discounted asset prices as before
and the additional variables $Y$ as values of nontraded quantities such as
stochastic volatilities which may or may not be observed directly. In this
Markovian setting, the solution of (72) and the optimal allocation have the
form

$$u_k(x) = g_k(x, Z_k) \qquad (74)$$

$$H_{k+1}(x) = h_{k+1}(x, Z_k) \qquad (75)$$

for (deterministic) Borel functions $\{g_k, h_{k+1}\}_{k=0}^{K-1}$ mapping $\mathrm{dom}(U) \times \mathcal{S}$ to $\mathbb{R}$
and $\mathbb{R}^d$ respectively. Similarly, the solution of (73) has the optimal allocation
$H_{k+1}$ as above and $B_k$ in the form

$$B_k(x) = b_k(x, Z_k) \qquad (76)$$

for Borel functions $b_k$ mapping $\mathrm{dom}(U) \times \mathcal{S}$ to $\mathbb{R}$.

As indicated in the previous section, matters simplify in the special case
of the exponential utility function $U(x) = -e^{-\gamma x}/\gamma$, $\gamma > 0$. One finds that the
dynamic program can be written in the wealth independent form $u_k(x) =
U(x) v_k$, $H_k(x) = H_k$, and $B_k(x) = B_k$ where the random variables $v$, $H$ and
$B$ have the form:

$$v_k = g_k(Z_k) \qquad (77)$$

$$H_{k+1} = h_{k+1}(Z_k) \qquad (78)$$

$$B_k = b_k(Z_k) \qquad (79)$$

for deterministic functions $g_k$, $h_{k+1}$, and $b_k$ on the state space $\mathcal{S}$. The iteration
equations are simply

$$g_k(Z) = \inf_{h \in \mathbb{R}^d} E_k\left[\exp(-\gamma h \cdot \Delta S_{k+1})\, g_{k+1}(Z_{k+1}) \mid Z_k = Z\right] \qquad (80)$$

$$\exp(\gamma b_k(Z)) = \inf_{h \in \mathbb{R}^d} E_k\left[\exp\left(-\gamma (h \cdot \Delta S_{k+1} - b_{k+1}(Z_{k+1}))\right) \mid Z_k = Z\right] \qquad (81)$$

and the optimal $h$ defines the function $h_{k+1}(Z)$.
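
The recursion (81) can be carried out exactly on a small tree, where the conditional expectation is a finite sum. The following sketch is an illustration under assumptions made here: a hypothetical recombining trinomial model with three trading dates, state $Z_k = S_k$, and a ternary search for the inner minimization. It is not the Monte Carlo method of the next section.

```python
import math
from functools import lru_cache

gamma, K = 1.0, 3                 # risk aversion and number of trading dates (illustrative)
up, down = 1.1, 0.9
moves = [(up, 0.4), (1.0, 0.3), (down, 0.3)]   # (price multiplier, probability)

def solve_b0(phi, s0=1.0):
    """Return b_0 from recursion (81) with terminal condition b_K = phi(S_K)."""
    @lru_cache(maxsize=None)
    def b(k, ups, downs):
        s = s0 * up ** ups * down ** downs
        if k == K:
            return phi(s)
        # (price increment, probability, next certainty equivalent) per branch
        branches = [(s * (m - 1.0), p, b(k + 1, ups + (m > 1.0), downs + (m < 1.0)))
                    for m, p in moves]
        def objective(h):
            return sum(p * math.exp(-gamma * (h * d - nb)) for d, p, nb in branches)
        lo, hi = -50.0, 50.0
        for _ in range(200):      # ternary search on a convex function of h
            a, c = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
            if objective(a) < objective(c):
                hi = c
            else:
                lo = a
        return math.log(objective(0.5 * (lo + hi))) / gamma
    return b(0, 0, 0)

merton = solve_b0(lambda s: 0.0)                          # B = 0
put_price = solve_b0(lambda s: max(1.0 - s, 0.0)) - merton
linear_price = solve_b0(lambda s: s - 1.0) - merton       # replicable by holding one share
print(merton, put_price, linear_price)
```

The claim $S_K - S_0$ is replicated by holding one share at zero cost, and its indifference price comes out as zero up to rounding, while the put receives a strictly positive price inside the range of its payoff.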

5 The exponential utility allocation algorithm 

In this section we introduce a Monte Carlo method for learning the optimal 
trading strategy © for the discrete time Markovian problems discussed in 
the previous section. We want an algorithm which will generate an
approximate trading rule, based on a data set $\{Z_k^i\}_{i=1,\dots,N;\,k=0,\dots,K}$ where $Z_k^i \in \mathbb{R}^n$
denotes the “state” of the $i$th sample path at time $t_k = kT/K$.

Consider the discrete time problem (72) for a general utility function $U$
satisfying assumption 2. The optimal portfolio $H_{k+1}^i \in \mathbb{R}^d$ should be selected
as $h_{k+1}(X_k^i, Z_k^i)$ where $X_k^i$ is the wealth held at the point $(i,k)$. We can
see a basic difficulty with a Monte Carlo approach: to “learn” the function
$h_{k+1}$ from the data $\{Z\}$ will require being able to fill in the optimal wealth
from time $t_0$ to $t_k$. Then, conditionally upon knowing the wealth $X_k$ at time
$t_k$, finding the function $h_{k+1}$ requires dynamic programming backwards from
time $t_K = T$ to $t_k$. In other words, a Monte Carlo learning algorithm for
$H_k$ based on general utility will require both forward and backward dynamic
programming. We have no effective method to suggest for general utility.

By contrast, in the special case of exponential utility, the theoretical
optimal rule $H_{k+1}^i = h_{k+1}(Z_k^i)$ depends only on the directly observed data $\{Z_k^i\}$
and is independent of the wealth $X_k$. For this reason our algorithm works
only for exponential utility, and we take

$$U(x) = -e^{-x},$$
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for simplicity. 


5.1 The algorithm 

1. Step $k = K$: The final optimal allocation $H_K$ is defined to be the
$\mathbb{R}^d$-valued $\mathcal{F}_{K-1}$-measurable random variable which solves

$$\min_H E_{K-1}[\exp(-H \cdot \Delta S_K + B)], \quad \text{(a.s.)}$$

which is easily seen to be equivalent to the optimizer of

$$\min_H E[\exp(-H \cdot \Delta S_K + B)] \qquad (82)$$

Since the solution is known to be given by $H_K = h_K(Z_{K-1})$ for some
deterministic function $h_K \in \mathcal{B}(\mathcal{S})$ (the set of Borel functions on $\mathcal{S}$), we
write this as

$$\min_{h \in \mathcal{B}(\mathcal{S})} \Psi_K(h) \qquad (83)$$

where $\Psi_K(h) := E[\exp(-h(Z_{K-1}) \cdot \Delta S_K + B)]$. On a finite set of data,
we can pick an $R$-dimensional subspace $\mathcal{R}(\mathcal{S}) \subset \mathcal{B}(\mathcal{S})$ of functions on
$\mathcal{S}$ and attempt to “learn” a suboptimal solution

$$h_K^{\mathcal{R}} = \arg\min_{h \in \mathcal{R}(\mathcal{S})} \Psi_K(h)$$

By the central limit theorem, the expectation $\Psi_K(h)$ for $h$ in a
neighbourhood of $h_K^{\mathcal{R}}$, and hence the solution $h_K^{\mathcal{R}}$ itself, can be approximated
by the finite sample estimate

$$\hat{\Psi}_K(h) = \frac{1}{N} \sum_{i=1}^N \exp\left(-h(Z_{K-1}^i) \cdot \Delta S_K^i + \Phi(Z_K^i)\right) \qquad (84)$$

This leads to the estimator $\hat{h}_K^{\mathcal{R}}$ based on $\{Z_k^i\}$ and the choice of subspace
$\mathcal{R}$ defined by

$$\hat{h}_K^{\mathcal{R}} = \arg\min_{h \in \mathcal{R}(\mathcal{S})} \hat{\Psi}_K(h) \qquad (85)$$



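2. Inductive step for $k = K-1, \dots, 2$: The estimate $\hat{h}_k^{\mathcal{R}}$ of the optimal
rule $h_k$, for $2 \le k \le K-1$, is determined inductively given the estimates
$\hat{h}_{k+1}^{\mathcal{R}}, \dots, \hat{h}_K^{\mathcal{R}}$. It is defined to be

$$\hat{h}_k^{\mathcal{R}} = \arg\min_{h \in \mathcal{R}(\mathcal{S})} \hat{\Psi}_k(h; \hat{h}_{k+1}^{\mathcal{R}}, \dots, \hat{h}_K^{\mathcal{R}}) \qquad (86)$$

where

$$\hat{\Psi}_k(h; \hat{h}_{k+1}^{\mathcal{R}}, \dots, \hat{h}_K^{\mathcal{R}}) = \frac{1}{N} \sum_{i=1}^N \exp\left(-h(Z_{k-1}^i) \cdot \Delta S_k^i - \sum_{j=k+1}^K \hat{h}_j^{\mathcal{R}}(Z_{j-1}^i) \cdot \Delta S_j^i + \Phi(Z_K^i)\right) \qquad (87)$$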


3. Final step $k = 1$: This step is degenerate since the initial values $Z_0$ are
constant over the sample. Therefore we determine the optimal constant
vector $\hat{h}_1 \in \mathbb{R}^d$ by solving

$$\hat{h}_1 = \arg\min_{h \in \mathbb{R}^d} \hat{\Psi}_1(h; \hat{h}_2^{\mathcal{R}}, \dots, \hat{h}_K^{\mathcal{R}}) \qquad (88)$$

To summarize, the algorithm above learns a collection of functions of the
form $(\hat{h}_1, \hat{h}_2^{\mathcal{R}}, \dots, \hat{h}_K^{\mathcal{R}}) \in \mathbb{R}^d \times \mathcal{R}(\mathcal{S})^{K-1}$ from the Monte Carlo simulation.
This collection defines a suboptimal allocation strategy for the exponential
hedging problem. Finally, the optimal value $\hat{\Psi}_1(\hat{h}_1; \hat{h}_2^{\mathcal{R}}, \dots, \hat{h}_K^{\mathcal{R}})$ is an
estimate of the quantity $e^{B_0}$, where $B_0$ is the certainty equivalent value of the
claim $B$ at time $t = 0$.
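
The three steps above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation behind the results reported later: it takes $d = 1$, $r = 0$ and $\gamma = 1$, simulates geometric Brownian paths with Python's standard library, uses the hypothetical basis $\{1, S\}$ for $\mathcal{R}(\mathcal{S})$, and minimizes each sample objective with a damped Newton iteration (the text does not prescribe a minimizer). All function names are illustrative.

```python
import math
import random

def dot(c, f):
    return sum(ci * fi for ci, fi in zip(c, f))

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= fac * M[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def minimise(feats, dS, A, iters=50):
    """Damped Newton for c -> (1/N) sum_i exp(-(c . feats_i) dS_i + A_i)."""
    N, R = len(dS), len(feats[0])
    def weights(c):
        return [math.exp(min(-dot(c, f) * d + a, 700.0)) / N
                for f, d, a in zip(feats, dS, A)]
    c = [0.0] * R
    val = sum(weights(c))
    for _ in range(iters):
        w = weights(c)
        grad = [-sum(wi * d * f[p] for wi, d, f in zip(w, dS, feats)) for p in range(R)]
        hess = [[sum(wi * d * d * f[p] * f[q] for wi, d, f in zip(w, dS, feats))
                 for q in range(R)] for p in range(R)]
        step = solve(hess, [-g for g in grad])
        t, accepted = 1.0, False
        while t > 1e-8 and not accepted:      # backtracking: accept only decreases
            trial = [ci + t * si for ci, si in zip(c, step)]
            trial_val = sum(weights(trial))
            accepted = trial_val < val
            t *= 0.5
        if not accepted:
            break
        gain, c, val = val - trial_val, trial, trial_val
        if gain < 1e-14:
            break
    return c, val

def learn(N=2000, K=4, T=1.0, mu=0.1, sigma=0.2, strike=None, seed=0):
    """Backward learning of h_K, ..., h_1; returns the rules and the estimate of exp(B_0)."""
    rng, dt = random.Random(seed), T / K
    paths = []
    for _ in range(N):
        s = [1.0]
        for _ in range(K):
            s.append(s[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt
                                      + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)))
        paths.append(s)
    # A_i holds Phi(Z_K^i) minus the gains of the rules already learned (steps j > k).
    A = [0.0 if strike is None else max(strike - s[-1], 0.0) for s in paths]
    rules = {}
    for k in range(K, 0, -1):
        feats = [[1.0] if k == 1 else [1.0, s[k - 1]] for s in paths]   # step 1 is constant
        dS = [s[k] - s[k - 1] for s in paths]
        rules[k], val = minimise(feats, dS, A)
        A = [a - dot(rules[k], f) * d for a, f, d in zip(A, feats, dS)]
    return rules, val

rules0, v0 = learn()                # Merton problem (no claim)
rulesB, vB = learn(strike=1.0)      # liability: one at-the-money put
print("estimate of exp(B_0), Merton:", v0)
print("indifference price estimate:", math.log(vB) - math.log(v0))
```

Both runs use the same seed and hence the same paths, so the difference of the two log values is a sample estimate of the seller's indifference price of the put under these assumptions.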


5.2 Discussion of errors 

It is important to identify two distinct systematic sources of error in the
algorithm. The first, which we call approximation one, is in focusing on
suboptimal solutions $h^{\mathcal{R}}$ which lie in a specified subspace $\mathcal{R}(\mathcal{S})$ of the full
space $\mathcal{B}(\mathcal{S})$. From a pragmatic perspective, we need to select a set of $R$ basis
functions $f_1, \dots, f_R$ for $\mathcal{R}(\mathcal{S})$ which does a good job of representing the true
optimal function over the values of state space covered by the Monte Carlo
simulation. Naively, one might expect to need to choose $R$ exponentially
related to the dimension of $\mathcal{S}$; experience seems to indicate far fewer functions
are needed for higher dimension problems. For a discussion of this type of
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question in the context of the Longstaff-Schwartz (LS) method for American 
options, see [| 21 | and J5[. Observe that the requirements of our algorithm 
are much more stringent than for the American option problem, since the 
strategy to be learned is not simply “to exercise or not to exercise”, but must 
select a high dimensional vector at each point (i, k ) in the simulation. Having 
said this, we take the point of view that the careful selection of a subspace
$\mathcal{R}(\mathcal{S})$ might lead to good performance of the algorithm. Furthermore, our
experiments show that the sensitivity to changes in $\mathcal{R}(\mathcal{S})$ of quantities such
as indifference prices is much less than that of quantities such as hedge
allocations.

The second source of error, approximation two, is the finite $N$
approximation. We can in principle estimate this error in terms of the basic model
parameters; the following is a heuristic argument to give a flavour of the
problem for the $k$th step, $k \le K$. By the central limit theorem, for a given
confidence level $1 - \alpha$, $\alpha \ll 1$, there exist constants $C_1$, $C_2$ so that

$$\|\hat{\Psi}(h) - \Psi(h)\| \le \frac{C_1}{\sqrt{N}}, \qquad \|\nabla\hat{\Psi}(h) - \nabla\Psi(h)\| \le \frac{C_2}{\sqrt{N}}$$

with probability $1 - \alpha$, for $h$ in a convex neighbourhood of the true
critical point $h^{\mathcal{R}}$, defined by $\nabla\Psi(h^{\mathcal{R}}) = 0$. We suppose that the estimated
critical point $\hat{h}^{\mathcal{R}}$, defined by $\nabla\hat{\Psi}(\hat{h}^{\mathcal{R}}) = 0$, lies in this neighbourhood, and
furthermore the operator inequalities $0 < C_3 \le \nabla^2\Psi \le C_4$ hold on the same
neighbourhood. Then one immediately derives the inequalities

$$\|\hat{h}^{\mathcal{R}} - h^{\mathcal{R}}\| \le \frac{C_2}{C_3\sqrt{N}} \qquad (89)$$

$$|\hat{\Psi}(\hat{h}^{\mathcal{R}}) - \Psi(h^{\mathcal{R}})| \le \frac{C_1}{\sqrt{N}} + \frac{C_4 C_2^2}{2 C_3^2 N} \qquad (90)$$

which show convergence of $\hat{h}^{\mathcal{R}}$ to $h^{\mathcal{R}}$ for $N \to \infty$.

The above discussion addresses the errors made at the $k$th time step of
the algorithm. Further study is needed to understand how errors accumulate
as $k$ is iterated. The answer to this question will give guidance on how
to distribute computational effort over the different time steps, and can be
expected to parallel the same question as it arises for the LS algorithm.
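
The $1/\sqrt{N}$ behaviour of the finite sample error can be observed in a toy one-step setting. The sketch below assumes a hypothetical Gaussian gain $X \sim N(m, s^2)$, for which the minimizer of $E[e^{-hX}]$ is known in closed form as $h = m/s^2$, and estimates it from samples of increasing size by a Newton iteration. The distribution and sample sizes are illustrative assumptions, not part of the paper's experiments.

```python
import math
import random

def estimate_h(sample, iters=50):
    """Newton iteration for the minimiser of h -> mean(exp(-h * x)) over the sample."""
    h = 0.0
    for _ in range(iters):
        w = [math.exp(-h * x) for x in sample]
        grad = -sum(wi * x for wi, x in zip(w, sample))
        hess = sum(wi * x * x for wi, x in zip(w, sample))
        step = -grad / hess
        h += step
        if abs(step) < 1e-12:
            break
    return h

m, s = 0.1, 0.2                 # hypothetical one-period gain X ~ N(m, s^2)
h_true = m / s ** 2             # exact minimiser of E[exp(-h X)] = exp(-h m + h^2 s^2 / 2)
rng = random.Random(1)
for n in (100, 10000, 40000):
    sample = [rng.gauss(m, s) for _ in range(n)]
    print(n, abs(estimate_h(sample) - h_true))
```

In this toy case a short calculation gives an asymptotic standard deviation of roughly $6.3/\sqrt{N}$ for the estimated holding, against a true value of 2.5, which illustrates how large the constants in such bounds can be relative to the quantity being learned.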
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6 Numerical implementation 


We have tested the algorithm in the simple problem of investment with
exponential utility $U(x) = -e^{-x}$ in a stock which behaves as a geometric Brownian
motion, with and without the purchase of a single at-the-money European
put option. We have seen there is an exact solution to this problem which
can be compared in detail to the solution generated by the Monte Carlo
algorithm.

We consider the model of (63) with $d = 1$ and parameters $S_0 = 1$, $\mu = 0.1$,
$\sigma = 0.2$ and $r = 0.0$ over the period of one year $T = 1$. We apply the
allocation algorithm to two scenarios involving portfolio selection at discrete
time intervals of 1/50 (i.e. weekly): i) the Merton investment problem;
and ii) the hedging problem for the buyer of a single written at-the-money
European put. In each case we apply the method for simulations of length
$N = 1000, 10000, 100000$. Then, for comparison to theory, we use the same
Monte Carlo simulations, but rehedged weekly according to the theoretical
formula (70), with $-H^B$ equal to the Black-Scholes delta of the option.

Our results are displayed in figures 1 to 5. Figures 1,2,3,4 show the 
profit/loss distributions at time T for the learned Merton, learned put option, 
true Merton and true put option cases respectively. They show the empirical 
distributions for N = 1000,10000,100000. Figure 5 shows the values of 
the hedge ratio along a single sample path calculated according to both the 
strategy learned with N = 100000 and the true strategy. 

For comparison of their performances, we tabulate below the mean and
the standard deviation of these distributions in each of the four cases, as well
as the final expected exponential utility with parameters $\gamma = 1/4$ ($U_1$), $\gamma = 1$
($U_2$) and $\gamma = 4$ ($U_3$), corresponding to an increasing order of risk-aversion.
As measures of the risk associated with each case, we also tabulate their
value-at-risk and conditional value-at-risk for 90% (VaR_90 and CVaR_90) and
99% (VaR_99 and CVaR_99) confidence levels.
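
The text does not spell out the convention used for these risk measures. The sketch below shows one common empirical convention, adopted here purely as an assumption: VaR at level $\alpha$ is the $(1-\alpha)$ sample quantile of the profit/loss, and CVaR is the average of the outcomes at or below it. The function name and the sample are hypothetical.

```python
def var_cvar(pnl, level_pct):
    """Empirical VaR and CVaR of a profit/loss sample at an integer confidence level in percent."""
    ordered = sorted(pnl)
    k = max(len(ordered) * (100 - level_pct) // 100, 1)   # number of worst outcomes kept
    tail = ordered[:k]
    return tail[-1], sum(tail) / k

sample = list(range(-50, 50))      # hypothetical profit/loss outcomes
print(var_cvar(sample, 90))        # (-41, -45.5)
print(var_cvar(sample, 99))        # (-50, -50.0)
```

Under this convention both measures are reported as profit/loss levels, so they are negative for losses and CVaR is never above VaR.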

From the $U_2$ values on the table, one can derive the learned estimates of
the indifference price 0.0767, 0.0790 and 0.0792 for the cases $N = 1000, 10000$
and $100000$. Using the true strategy leads to the values 0.0798, 0.0796, 0.0795,
respectively. The theoretical Black-Scholes price is 0.0797.

Case   Mean     St. Dev.   U_1       U_2       U_3       VaR_99    VaR_90    CVaR_99   CVaR_90
1a     0.3572   0.5778     -0.9241   -0.8262   -3.3023   -1.0829   -0.3659   -1.2053   -0.6495
1b     0.2768   0.5136     -0.9409   -0.8674   -3.5224   -1.0159   -0.3743   -1.2144   -0.6543
1c     0.2528   0.5013     -0.9462   -0.8810   -2.7684   -0.9209   -0.3913   -1.0913   -0.6318
2a     0.4349   0.5797     -0.9064   -0.7652   -2.4748   -0.9828   -0.3012   -1.1348   -0.5720
2b     0.3562   0.5142     -0.9224   -0.8015   -2.9355   -0.9518   -0.2732   -1.1756   -0.5626
2c     0.3325   0.5020     -0.9275   -0.8139   -2.4439   -0.8532   -0.2859   -1.0633   -0.5387
3a     0.2307   0.4898     -0.9511   -0.8956   -2.5283   -0.9723   -0.3961   -1.0318   -0.6430
3b     0.2524   0.4945     -0.9460   -0.8773   -2.4184   -0.8852   -0.3778   -1.0429   -0.6096
3c     0.2506   0.4995     -0.9466   -0.8816   -2.6461   -0.9081   -0.3896   -1.0733   -0.6254
4a     0.3108   0.4904     -0.9322   -0.8269   -1.8302   -0.8878   -0.3135   -0.9492   -0.5628
4b     0.3322   0.4954     -0.9274   -0.8102   -1.7521   -0.8054   -0.2979   -0.9616   -0.5289
4c     0.3304   0.5005     -0.9279   -0.8142   -1.9178   -0.8274   -0.3098   -0.9921   -0.5449

Table 1: Mean, standard deviation, final expected utilities and risk measures
for the profit/loss distribution of the (1) learned Merton, (2) learned put option,
(3) true Merton and (4) true put option portfolios with (a) 1000, (b) 10000 and
(c) 100000 Monte Carlo simulations of stock prices following a geometric
Brownian motion.
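
The indifference prices quoted above can be recovered from the $U_2$ column of Table 1. With $\gamma = 1$ the certainty equivalent value is the logarithm of minus the expected utility, and the buyer's price is the Merton value minus the put value, as in (57). The sketch below redoes this arithmetic and evaluates the Black-Scholes put formula with the parameters of this section; the helper names are illustrative.

```python
import math

# U_2 column of Table 1 (gamma = 1): learned Merton (1), learned put (2),
# true Merton (3), true put (4), for (a) 1000, (b) 10000, (c) 100000 paths.
U2 = {"1a": -0.8262, "1b": -0.8674, "1c": -0.8810,
      "2a": -0.7652, "2b": -0.8015, "2c": -0.8139,
      "3a": -0.8956, "3b": -0.8773, "3c": -0.8816,
      "4a": -0.8269, "4b": -0.8102, "4c": -0.8142}

def price(merton_case, put_case):
    """Difference of certainty equivalent values log(-U_2): Merton minus put."""
    return math.log(-U2[merton_case]) - math.log(-U2[put_case])

learned = [price("1" + n, "2" + n) for n in "abc"]
true = [price("3" + n, "4" + n) for n in "abc"]

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_put(S, K, sigma, T, r=0.0):
    """Black-Scholes price of a European put."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return K * math.exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)

print([round(p, 4) for p in learned])
print([round(p, 4) for p in true])
print(round(bs_put(1.0, 1.0, 0.2, 1.0), 4))
```

The rounded outputs reproduce the figures in the text to within the four-digit precision of the table.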

7 Discussion 

This paper seeks to bridge the gap between the theory of exponential hedg¬ 
ing in incomplete markets and the numerical implementation of that the¬ 
ory. Utility based hedging introduces several key concepts, notably certainty 
equivalent values and indifference prices which have no counterpart in com¬ 
plete markets. Therefore we have little experience or intuition on which to 
base our understanding of optimal trading in these markets. The simple and 
flexible Monte Carlo algorithm we introduce in this paper provides a test 
bed for realizing the theory of exponential hedging in essentially any market 
model. For example, problems involving American style early-exercise op¬ 
tions can in principle be easily included in our framework by following the 
Longstaff-Schwartz Monte Carlo method [21]. Using our method for a variety
of problems should help one gain intuition and understanding of how expo¬ 
nential hedging works in practice and how it compares with other hedging 
approaches. 

Our preliminary study of the geometric Brownian model shows, not
unexpectedly, that the method performs better for pricing than hedging.
Interestingly the indifference price, perhaps the key theoretical concept, appears
to be better approximated than the two certainty equivalent values which
define it. On the other hand, as we see from the sample path shown in figure
5, the actual hedging strategy learned by the algorithm deviates a lot from
the theoretical strategy along individual stock trajectories, and cannot be
seen as reliable.

Predictably, the basic method we use shows some distinct shortcomings 
which prevent it from being taken as a de jure guide to real trading. Approx¬ 
imation one arises by restricting possible hedge strategies to a low dimen¬ 
sional subspace. It is clear that such a restriction will often lead to unsuitable 
strategies. However, we feel that approximation two, the finite sample size 
error, will likely be even more problematic for practical realizations. A brief 
study of the size of the constants which enter the estimates (89) and (90)
suggests that reliable learned strategies will demand a very large value of 
N. In our simulations, N = 100000 gave reliable prices, but not hedging 
strategies. A third difficulty we noticed arising in our method is that learned 
strategies fluctuate far too much in time. Some simple smoothing procedure 
in time might lead to a marked improvement in hedging. 

To conclude this discussion, it is worthwhile to revisit the way in which 
our method of dynamic programming (finding H by induction over K steps 
backwards in time) leads to computational efficiency compared to a more 
direct approach which seeks to compute the optimal hedging strategy $H$
simultaneously at all times. Fixing as before an $R$-dimensional subspace $\mathcal{R}(\mathcal{S})$
for the form of the hedging strategy at each time, direct optimization of a
single convex function of $K \times R$ variables costs $O(NR^2K^2)$ flops. By dynamic
programming this is reduced to $K$ sequential optimizations of functions of
$R$ variables which will take $O(NR^2K)$ flops. Accuracy is preserved by
dynamic programming because the $KR \times KR$ Hessian matrix of the global
optimization is approximately block diagonal over the individual time steps.

Putting aside the obvious drawbacks of the algorithm, we can see that
our very simple and direct method will shed light on most conceptual
difficulties arising in exponential hedging in incomplete markets. It implements the
spirit of dynamic programming and prices claims quite reliably, even if it
cannot easily produce accurate estimates of hedging strategies. On these merits
alone, we think our algorithm deserves much further study and refinement.

[Figure 1 panels, titled "Merton’s problem using learned strategy": Profit/Loss
histograms for 1000, 10000 and 100000 simulations.]

Figure 1: The profit/loss distribution of the learned investment portfolio,
obtained from the exponential utility allocation algorithm as an approximated
solution to Merton’s problem, evaluated on simulated stock prices following
a geometric Brownian motion.

[Figure 2 panels, titled "Hedging one bought put using learned strategy":
Profit/Loss histograms; recovered panel label: 1000 simulations.]

Figure 2: The profit/loss distribution of the learned hedging portfolio for
the buyer of one put option, obtained from the exponential utility allocation
algorithm on simulated stock prices following a geometric Brownian motion.

[Figure 3 panels, titled "Merton problem using true strategy": Profit/Loss
histograms; recovered panel label: 1000 simulations.]

Figure 3: The profit/loss distributions of optimal investment portfolio,
obtained as the exact solution for Merton’s problem with exponential utility,
evaluated on simulated stock prices following a geometric Brownian motion.

[Figure 4 panels, titled "Hedging one bought put using true strategy":
Profit/Loss histograms for 1000, 10000 and 100000 simulations.]

Figure 4: The profit/loss distributions of the optimal hedging portfolio for the
buyer of one put option, obtained from Black-Scholes delta hedging combined
with Merton’s problem with exponential utility, evaluated on simulated stock
prices following a geometric Brownian motion.

[Figure 5 plot, titled "True and learned hedge ratios along sample path for one
bought put".]

Figure 5: The hedge ratio (number of shares held) for the buyer of one
put option on a simulated sample path of duration one year, for which the
option matures in-the-money. The solid line shows the strategy learned
with N=100000; the broken line shows the theoretical Black-Scholes-Merton
strategy.
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